In this module, we move from weight-based fine-tuning to In-Context Learning (ICL). We explore how Large Language Models (LLMs) adapt to a new task not by updating their weights, but by conditioning on the content and structure of the prompt itself.
1. From Telling to Showing
While an instruction provides a general direction, "imitation" through input-output pairs $(x, y)$ acts as a non-parametric guide: the model's weights stay fixed and only the prompt changes. These examples serve as statistical anchors that narrow the model's output distribution, reducing the ambiguity inherent in natural-language instructions.
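One way to make the "narrowing" concrete is to write both cases as conditional distributions. The notation is an assumption introduced here, not part of the module: $\theta$ denotes the frozen model weights, $I$ the instruction text, $x$ the new input, and $k$ the number of exemplars.

```latex
\begin{align*}
\text{instruction only:}\quad & p_\theta(y \mid I,\, x) \\
\text{with } k \text{ exemplars:}\quad & p_\theta\bigl(y \mid I,\, (x_1, y_1), \dots, (x_k, y_k),\, x\bigr)
\end{align*}
```

The weights $\theta$ are identical in both lines; only the conditioning context differs, which is the sense in which the exemplars act as a non-parametric guide.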
2. The Mechanics of Attention
ICL relies on the Transformer’s attention mechanism to perform "task induction." By attending to regularities across the exemplars in your prompt, the model infers the input-to-output mapping they share and applies it to the new input, which lets it reproduce styles and structures closely.
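As a rough illustration of the attention step only, the sketch below implements single-head scaled dot-product attention in plain Python. The 2-dimensional vectors are hand-picked, hypothetical embeddings; a trained Transformer learns its own projections and stacks many such layers, so this is a toy picture of how a query for the new input can weight earlier exemplars, not a model of ICL itself.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    d = len(query)
    # Similarity of the query to each key, scaled by sqrt(d).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    # Output is the attention-weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values))
              for i in range(dim)]
    return weights, output

# Hypothetical embeddings: two exemplars resemble the query, one does not.
keys = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
values = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
query = [1.0, 0.0]

weights, output = attention(query, keys, values)
print("attention weights:", [round(w, 3) for w in weights])
print("output vector:", [round(o, 3) for o in output])
```

The exemplars whose keys resemble the query receive the largest weights, so the output is pulled toward their values. That is the intuition behind "identifying regularities" in the prompt.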
Goal: Provide a three-exemplar few-shot prompt that teaches the model a specific "Concise Executive" style, rather than just a generic professional tone.
Adjectives like "Concise" are subjective and admit many interpretations, so an instruction built on them leaves the model's output distribution broad; examples provide a concrete structural template that the attention mechanism can imitate far more consistently.
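A minimal sketch of such a prompt follows. The three exemplars, the `Input:`/`Output:` labels, and the headline/`Cause:`/`Ask:` output structure are all invented for illustration; they are one possible reading of a "Concise Executive" style, not a prescribed format. The helper name `build_prompt` is likewise hypothetical.

```python
# Hypothetical exemplars: each pairs a verbose update with a rewrite that
# follows the same three-part structure (headline, cause, ask).
EXEMPLARS = [
    (
        "After a lengthy review, the team found that the Q3 launch will slip "
        "by about two weeks because the vendor delivered the API late.",
        "Launch slips 2 weeks. Cause: late vendor API. Ask: none.",
    ),
    (
        "We have been seeing roughly 30 percent more support tickets lately, "
        "mostly about the new billing page, and we would like to borrow one "
        "engineer to fix it.",
        "Support tickets up 30%. Cause: new billing page. Ask: one engineer.",
    ),
    (
        "The pilot of the new checklist with three regional offices went "
        "well, with onboarding time dropping from ten days to six, so we "
        "think it makes sense to roll it out everywhere.",
        "Onboarding down from 10 to 6 days. Cause: new checklist pilot. "
        "Ask: approve full rollout.",
    ),
]

def build_prompt(exemplars, new_input):
    """Format (input, output) pairs, then leave the final output open."""
    blocks = [f"Input: {x}\nOutput: {y}" for x, y in exemplars]
    blocks.append(f"Input: {new_input}\nOutput:")
    return "\n\n".join(blocks)

prompt = build_prompt(
    EXEMPLARS,
    "Our cloud bill went up quite a bit last month, around 18 percent, "
    "largely due to unused test clusters, and we want approval to shut "
    "them down.",
)
print(prompt)
```

Every exemplar output repeats the same structure, so the pattern the model is asked to continue is the structure itself rather than the word "concise". The prompt ends on an open `Output:` line for the model to complete.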